Introduction to R
for Social Scientists

Workshop Day 1A | 2022-07-25
Jeffrey M. Girard | Pitt Methods

Overview

Instructor

Jeffrey Girard, PhD
www.jmgirard.com
jmgirard@ku.edu

Background

  • Assistant Professor, University of Kansas
  • Research Postdoc, Carnegie Mellon University
  • PhD Student, University of Pittsburgh

Research Areas

  • Psychological Assessment
  • Affective/Interpersonal Communication
  • Applied Statistics and Machine Learning
  • Data Science and Software Engineering

R Rationale

  1. Think of your computer as the engine of a car
    • It provides raw power for computation
  2. The R language is like the controls for the car
    • It lets you apply and direct that power
  3. RStudio is like a fancy dashboard for the car
    • It adds extra information and convenience
  4. An R package is like an add-on for the car
    • It adds new features and capabilities

Workshop Goals

  • This is a beginner-friendly workshop aimed at social scientists with little to no experience in R
  • My goal this week is to “teach you how to drive”
  • Through lectures and live coding, you will learn the fundamentals of programming, data wrangling, visualization, and modeling in R
  • Through hands-on exercises, you will gain confidence in your skills and ability to learn
  • I will help you get your “driver’s license” but you will need to practice to become a pro

Workshop Roadmap

DAY 1A        DAY 2A        DAY 3A
Overview      Program II    Model I
Program I     Wrangle III   Model II
Practice I    Practice III  Practice V

DAY 1B        DAY 2B        DAY 3B
Wrangle I     Visualize I   Preview
Wrangle II    Visualize II  Open Q&A
Practice II   Practice IV   Consulting

Workshop Etiquette

Things to Do

  • Behave respectfully and with patience
  • Ask for help in chat or “raise hand”
  • Turn your camera on or off as desired
  • Come and go from workshop as needed

Things Not to Do

  • Don’t disparage yourself or others
  • Don’t stay confused for too long
  • Don’t unmute yourself when not talking
  • Don’t re-sell the workshop materials

Installing R

Windows

  1. Open a web browser
  2. Visit cloud.r-project.org
  3. Click “Download R for Windows”
  4. Click the “base” subdirectory link
  5. Click “Download R-4.X.X” (e.g., 4.2.1)
  6. Run the downloaded .exe file
  7. Select all the default options
  8. Complete the installation wizard

macOS

  1. Open a web browser
  2. Visit cloud.r-project.org
  3. Click “Download R for macOS”
  4. Click “R-4.X.X.pkg” (e.g., 4.2.1)
  5. Run the downloaded .pkg file
  6. Select all the default options
  7. Complete the installation wizard

Installing RStudio

Windows

  1. Open a web browser
  2. Visit rstudio.com/download
  3. Scroll down until you find the table under the “All Installers” section
  4. Find the row for “Windows 10/11”
  5. Click “RStudio-2022.XX.X-XXX.exe”
  6. Run the downloaded .exe file
  7. Select all the default options
  8. Complete the installation wizard

macOS

  1. Open a web browser
  2. Visit rstudio.com/download
  3. Scroll down until you find the table under the “All Installers” section
  4. Find the row for “macOS 10.15+”
  5. Click “RStudio-2022.XX.X-XXX.dmg”
  6. Run the downloaded .dmg file
  7. Drag the RStudio icon to your Applications folder (if you want)

RStudio Window

File Management

  • Projects are special folders on your computer
    • They contain all files related to a task
    • They keep everything together and organized
  • Projects make it easy to find and use your files
    • No need to specify long, annoying file paths
    • No need to worry about working directories
  • Projects make it easy to switch between tasks
    • They will remember exactly where you left off
    • You can even open multiple projects at once
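
As a minimal sketch of what "no long file paths" looks like in practice: when a Project is open, R's working directory is the project folder, so a file can be referred to by its name alone. The file name `r4ss_path_demo.txt` is a hypothetical placeholder, and the sketch assumes the working directory is writable.

```r
# Where R currently looks for files (the project folder when a Project is open)
project_folder <- getwd()
project_folder

# Hypothetical file name; no folder path is needed
example_file <- "r4ss_path_demo.txt"

# Write one line of text to the file, then read it back
writeLines("Hello World", example_file)
file_text <- readLines(example_file)
file_text

# The file sits directly inside the working directory
file_found <- file.exists(file.path(project_folder, example_file))
file_found

# Clean up the example file
file.remove(example_file)
```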

Projects Live Coding

# Create a new Project
- Open the "File" menu in RStudio
- Select the "New Project..." option
- Select the "New Directory" option
- Select the "New Project" option
- Name the directory "R4SS" (or whatever)
- Browse to where you want to create your Project folder

# Create a new File
- Explore the Files tab in the Extras pane
- Create a New Blank File (e.g., a text file) as an example
- RStudio will automatically create it in your project folder
- Add some text to the example file (e.g., "Hello World")
- Close the text file with the "x" icon
- Reopen the text file from the Files tab

# Close and Open Project
- Open the "File" menu in RStudio
- Select the "Close Project" option
- Notice that your work is now gone
- Open the "File" menu in RStudio
- Select the "Open Project" option
- Browse to your project folder
- Open the R4SS.Rproj file
- Notice that your work is now back!

Program I

R will Grant your Wishes

  • R is like a well-meaning but overly literal genie

    • It has the power to grant almost any wish
    • But we must phrase our wishes carefully!
    • We will always get what we ask for…
    • …but not always what we wanted.
  • Mastering the R language means learning…

    • How to properly phrase commands
    • How to decipher error messages
    • How to view code from R’s perspective
    • How to detect and correct small mistakes

Communicating with R

  • The Console is like a chat window with R
    • You send a command to R and get a response
    • Neither side of the conversation is saved
  • A Script is like an email thread with R
    • You send many commands to R all at once
    • Only your side of the conversation is saved
  • RMarkdown is like a scrapbook with R
    • You can combine code and formatted text
    • Both sides of the conversation are saved

Console Live Coding

# Addition

10+3
10 + 3 # spaces are optional but recommended


# Subtraction

10 - 3


# Multiplication

10 * 3 # correct
10 x 3 # error


# Division

10 / 3 # correct
10 \ 3 # error


# Exponentiation

10 ^ 2


# Order of Operations

10 + 3 * 2
(10 + 3) * 2


# Negative Numbers

10 + -30


# Decimals and Fractions

1.234
(1 / 3)


# Leading and Trailing Zeros

09.870


# Large Numbers

9876543 # correct
9,876,543 # error
9 876 543 # error
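
For reference, here is a short sketch of what a few of the commands above should return. The object names (`result_default` and so on) are just labels added for this illustration.

```r
# Multiplication happens before addition
result_default <- 10 + 3 * 2
result_default   # 16

# Parentheses change the order
result_grouped <- (10 + 3) * 2
result_grouped   # 26

# Adding a negative number is the same as subtracting
result_negative <- 10 + -30
result_negative  # -20

# Leading and trailing zeros do not change the value
same_value <- 09.870 == 9.87
same_value       # TRUE
```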

RMarkdown Live Coding

# Create an RMarkdown Document
- Open the "File" menu in RStudio
- Select the "New File" option
- Select the "R Markdown..." option
- Keep the defaults (HTML Document) and hit "OK"
- Open the "File" menu
- Select the "Save" option
- Note that it defaults to the project folder
- Give it a name like "Day 1A" (or whatever)
- Note that the file extension is .Rmd

# Remove the boilerplate content
- The top part of the notebook is called the "Header"
- Don't delete the header or the notebook won't work
- You can change the title but keep it in quotes
- Highlight and delete everything below the header

# Enter the Visual Editor
- Click on the "Visual" button at the top
- Check the box for "Knit on Save"
- Save by clicking the disk icon
- Look at the preview in the "Viewer" tab

# Add Formatted Text (i.e., Markdown)
- Below the header you can add formatted text
- Use the visual editor to add formatting easily
- Show how to add bold, italics, headers, etc.
- Mention that you can add links, figures, and tables too

# Add R Chunks (i.e., R code)
- Click the green "Insert a new code chunk" button (top right)
- Show how you can also do this quickly by typing / in the Visual editor
- Inside the chunk, you can type R commands like a mini console
- Try doing some calculations in the chunk and hit the green arrow
- The answer appears right below the chunk!
- If we save and knit the document, it appears there too
- We can share the .html file with others
- It will include all the formatted text, code, and R's answers

Assignment

  • It is often useful to store data in named objects
    • This makes the data easier to use and re-use
    • This makes the code easier to write and read
  • Which command is easier to follow?
    1. Dial 7 8 5 8 6 4 0 8 4 1
    2. Call Office Phone
  • Named objects are created using assignment
    • Give a name then an arrow then the data

office <- 7858640841
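
A small sketch of why the arrow needs care, in the spirit of the overly literal genie. The arrow is a less-than sign followed directly by a hyphen; in RStudio, Alt and - (Option and - on macOS) types it for you. The object names `x` and `is_less` are placeholders for this illustration.

```r
# The arrow is typed as a less-than sign followed directly by a hyphen
office <- 7858640841
office

# PITFALL: a space inside the arrow turns it into a comparison
x <- 5
is_less <- x < -2   # asks "is x less than -2?"
is_less             # FALSE
x                   # still 5
```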

Assignment Live Coding

# LESSON: Assigning and printing

x <- 2
x

# ==============================================================================

# USECASE: Using an object in math (a la algebra) 

x * 4

2 * 4

# ==============================================================================

# LESSON: You must use assignment to update an object

x

x + 1

x # still 2

x <- x + 1
x # updated to 3

# ==============================================================================

# USECASE: We can use the same object multiple times in a line

(10 + x - 1) / x

# ==============================================================================

# USECASE: We can also use an object to create another object

y <- 10 + x
y

# ==============================================================================

# USECASE: We can also use multiple objects in a line

y / x

Naming

  • Object names can only include:
    • Letters: a-z and A-Z
    • Numbers: 0-9
    • Underscores: _
    • Periods: .
  • Additional Rules:
    • Must start with a letter or a period (a leading period cannot be followed by a number)
    • Cannot contain spaces or dashes
    • Cannot contain other symbols
    • Names are case-sensitive (age ≠ Age)
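
One way to check a candidate name against these rules is base R's `make.names()`, which rewrites text into a valid name. If the result matches what you typed, the name was already valid. This is only a sketch: the candidate names are hypothetical, and it uses quoted text (strings), which are covered later in this session.

```r
# A valid name comes back unchanged
valid_name <- make.names("heart_rate_1")
valid_name        # "heart_rate_1"

# Spaces, dashes, and other symbols are replaced with periods
fixed_space <- make.names("heart rate")
fixed_space       # "heart.rate"
fixed_symbol <- make.names("age@time2")
fixed_symbol      # "age.time2"

# Names that start with a number or underscore get an "X" in front
fixed_number <- make.names("1_heart_rate")
fixed_number      # "X1_heart_rate"
fixed_underscore <- make.names("_heart_rate")
fixed_underscore  # "X_heart_rate"
```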

Naming Live Coding

# LESSON: Good names are a balancing act

x <- 93 # what is it?

rate <- 93 # too short

heart_rate_in_beats_per_minute <- 93 # too long

heart_rate_bpm <- 93 # just right

# ==============================================================================

# PITFALL: Don't try to include spaces or dashes in names

heart rate <- 93 # error

heart-rate <- 93 # error

# ==============================================================================

# PITFALL: Don't try to include special symbols

age@time2 <- 12 # error

age_time2 <- 12 # correct

# ==============================================================================

# PITFALL: Don't try to put a number or underscore first

heart_rate_1 <- 93 # correct

1_heart_rate <- 93 # error

_heart_rate <- 93 # error

# ==============================================================================

# LESSON: Object names are case-sensitive

heart_rate <- 93

Heart_rate <- 88

heart_rate # still 93

Heart_rate # a new object

Functions

  • Recipes allow chefs to cook up tasty treats
    • Recipes call for ingredients
    • Recipes involve one or more steps
    • Steps transform ingredients into treats
  • Functions are like customizable recipes
    • Functions call for inputs (“arguments”)
    • Functions involve one or more lines of code
    • Code transforms inputs into outputs
    • Using functions requires parentheses (usually)

out <- f(in1, in2)

Functions Live Coding

# USECASE: Functions can perform a task more easily and readably

# TEMPLATE: output <- function_name(input)

9 ^ (1 / 2)

x <- sqrt(9)
x

# ==============================================================================

# LESSON: We can also use functions to transform objects

y <- 9

sqrt(y)

# ==============================================================================

# LESSON: We can even use functions to transform the result of calculations

2 / 3

round(2 / 3)

# ==============================================================================

# LESSON: We can customize what a function does using arguments

# TEMPLATE: output <- function_name(argument, argument_name = argument_value)

round(2 / 3, digits = 2)

round(2 / 3, digits = 3)

# ==============================================================================

# LESSON: Some arguments are optional because they have default values

round(2 / 3) # the default value for digits is 0

round(2 / 3, digits = 0)
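
To make the recipe analogy concrete, here is a sketch of what a function looks like on the inside. The function `fahrenheit_to_celsius()` and its arguments are hypothetical, written just for this illustration. It calls for one required input, has one optional input with a default value, and uses two lines of code to turn inputs into an output.

```r
# Hypothetical function: convert a temperature from Fahrenheit to Celsius
# - temp_f is a required argument (no default value)
# - digits is an optional argument (default value of 1)
fahrenheit_to_celsius <- function(temp_f, digits = 1) {
  temp_c <- (temp_f - 32) * 5 / 9   # step 1: convert
  round(temp_c, digits = digits)    # step 2: round and return the result
}

# Using the default value for digits
body_temp <- fahrenheit_to_celsius(98.6)
body_temp   # 37

# Customizing the function with a named argument
warm_day <- fahrenheit_to_celsius(100, digits = 0)
warm_day    # 38
```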

Vectors

  • Vectors combine similar objects into a collection
    • I like to imagine a train pulling multiple cars
    • A vector is one object with many sub-objects
    • We refer to each sub-object as an element
  • Some functions transform each element in turn
    • Double the amount of cargo in every train car
  • Some functions summarize across elements
    • Calculate the total cargo across all train cars

v <- c(1, 2, 3)

Vectors Live Coding

# LESSON: We can combine multiple elements into a vector

# TEMPLATE: vector_name <- c(element1, element2, element3)

x <- 4 9 16 25 # error

x <- c(4, 9, 16, 25)
x

y <- c(2, 3)
y

# ==============================================================================

# LESSON: We can also combine multiple vectors and elements

c(x, y)

c(x, y, 20)

# ==============================================================================

# USECASE: Math operators will transform each element individually

x + 1

x * 3

x # but again, this won't be saved unless you use assignment

# ==============================================================================

# USECASE: Some functions will also transform each element individually

sqrt(x)

log(x)

# ==============================================================================

# USECASE: Other functions will summarize the vector with a single number

length(x)

sum(x)

mean(x)

Strings

  • When talking to R, we need a way to distinguish
    • Object/function names (e.g., the mean function)
    • Text/character data (e.g., the word mean)
  • Strings are R’s way of storing text data
    • Strings can store any characters (no rules!)
    • Strings are created and displayed with quotes
  • R has great tools for working with strings
    • Strings can be collected into vectors
    • Special functions can transform strings

name <- "John Doe"

Strings Live Coding

# USECASE: Strings are the main way to store character data in R
 
my_color <- red # error

my_color <- "red" # correct

# ==============================================================================

# USECASE: Strings can also store symbols not allowed in object names

dye <- "red#40"
dye

dyes <- c("red#40", "blue#02")
dyes

# ==============================================================================

# PITFALL: Many operations you can do to numbers won't work for strings

dyes + 1 # error

mean(dyes) # warning, returns NA

# ==============================================================================

# USECASE: But other operations work for both or even just for strings

length(dyes)

nchar(dyes)

dyes2 <- toupper(dyes)
dyes2
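
One more sketch along the same lines: quotes are what tell R that something is text, so even digits inside quotes are stored as a string. The object names here are placeholders, and `paste()` is a base R function for gluing strings together.

```r
# Digits inside quotes are text, not a number
age_text <- "93"
age_class <- class(age_text)
age_class        # "character"

# Convert the text to a number before doing math with it
age_number <- as.numeric(age_text)
next_age <- age_number + 1
next_age         # 94

# paste() glues strings together (with a space in between by default)
dye_labels <- paste("dye:", c("red#40", "blue#02"))
dye_labels       # "dye: red#40" "dye: blue#02"
```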

Packages

  • Cookbooks are a great way to learn to cook
    • They contain lots of recipes and instructions
    • Browse an online bookstore for a cookbook
    • Order it to add it to your personal bookshelf
    • To use, pull the cookbook off the shelf
  • Packages are like cookbooks for R
    • They contain helpful functions and datasets
    • Browse an online repository for a package
    • Install it to add it to your personal library
    • To use, load the package from the library

library("pkg_name")

Packages Live Coding

# USECASE: The stringr package adds a function to fix capitalization

students <- c("mary anne", "BENjamin", "Lee")

# ==============================================================================

# PITFALL: But we can't use that function without installing the package

str_to_title(students) # error

# ==============================================================================

# LESSON: Installing a package using RStudio

# - RStudio > Extras pane > Packages tab > Install button

# ==============================================================================

# PITFALL: We also need to load the package before we can use it

str_to_title(students) # error

# ==============================================================================

# LESSON: We load the package using library()

library("stringr")
str_to_title(students) # finally works!

# ==============================================================================

# LESSON: We can also keep our packages updated using RStudio

# - RStudio > Extras pane > Packages tab > Update button
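
The same steps can also be done with code instead of buttons. This is a sketch rather than part of the live coding: the install line is commented out because it needs an internet connection, and the `tools` package is used as a stand-in because it ships with R. The file name is a hypothetical example.

```r
# Install a package from code (same as the Install button)
# install.packages("stringr")

# Check whether a package is in your library without loading it
has_tools <- requireNamespace("tools", quietly = TRUE)
has_tools    # TRUE

# Use one function from a package without loading the whole package
ext_direct <- tools::file_ext("Day 1A.Rmd")
ext_direct   # "Rmd"

# Or load the package first, then use its functions by name
library("tools")
ext_loaded <- file_ext("Day 1A.Rmd")
ext_loaded   # "Rmd"
```

The `package::function()` form is handy when you only need one function or want to make clear which cookbook a recipe comes from.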

Practice I